Analysis of Rebuild Processing in RAID5 Disk Arrays

Authors

  • Alexander Thomasian
  • Gang Fu
  • Spencer W. Ng
Abstract

RAID5 tolerates single disk failures by exclusive-ORing (XORing) the blocks corresponding to a requested block on the failed disk to reconstruct it. This results in increased loads on surviving disks and degraded disk response times with respect to normal mode operation. Provided a spare disk is available, a rebuild process systematically reads successive disk tracks, XORs them to recreate lost tracks, and writes them onto a spare disk, thus returning the system to its original state. Rebuild time is important since RAID5 disk arrays with a single disk failure are susceptible to data loss if a second disk fails. According to the vacationing server model (VSM), rebuild read requests on surviving disks are given a lower priority than external user requests, so as to have less impact on their response time. Given that disk loads are balanced due to striping, rebuild time can be approximated by the time to read the contents of any one of the surviving disks. The analysis of the M/G/1 queueing model of VSM given in this article is more accurate and yet simpler than a previous analysis, and it also takes into account the effect of disk zoning explicitly. We also present a heuristic method to estimate rebuild time, which can be combined with the new analysis. The ability to quickly and accurately estimate rebuild time is useful in computing the reliability of RAID5 systems, especially during design tradeoff studies. The accuracy of the various analyses to estimate rebuild time is checked against detailed simulation results.
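The XOR reconstruction described in the abstract can be illustrated with a minimal sketch. It is not taken from the article: the function names (`xor_blocks`, `make_stripe`, `reconstruct`) and the one-parity-block-per-stripe layout are assumptions made only for illustration, and the sketch ignores striping rotation, disk zoning, and request scheduling.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the bytewise XOR of a list of equally sized blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("all blocks must have the same size")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def make_stripe(data_blocks):
    """Hypothetical stripe layout: the data blocks followed by one parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def reconstruct(stripe, failed_index):
    """Recreate the block on the failed disk by XORing all surviving blocks."""
    survivors = [b for i, b in enumerate(stripe) if i != failed_index]
    return xor_blocks(survivors)


stripe = make_stripe([b"\x01\x02", b"\x10\x20", b"\xff\x00"])
lost = reconstruct(stripe, failed_index=1)  # pretend disk 1 has failed
```

Reconstructing one block requires reading the corresponding block from every surviving disk, which is the source of the increased load on surviving disks that the abstract mentions.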

Similar resources

Rebuild Strategies for Redundant Disk Arrays

RAID5 performance is critical while rebuild is in progress, since in addition to the increased load to recreate lost data on demand, there is interference caused by rebuild requests. We report on simulation results, which show that processing user requests at a higher, rather than the same priority as rebuild requests, results in a lower response time for user requests, as well as reduced rebui...

RAID5 Performance with Distributed Sparing

Distributed sparing is a method to improve the performance of RAID5 disk arrays with respect to a dedicated sparing system with N + 2 disks (including the spare disk), since it utilizes the bandwidth of all N + 2 disks. We analyze the performance of RAID5 with distributed sparing in normal mode, degraded mode, and rebuild mode in an OLTP environment, which implies small reads and writes. The an...

Rebuild Strategies for Clustered Redundant Disk Arrays

RAID5 tolerates single disk failures by recreating lost data blocks on demand, but this results in the doubling of the load of surviving disks for pure read workload. This increase may be unacceptable if the original load was high. Clustered RAID (CRAID) with parity group size G smaller than the number of disks (G < N) was proposed so that the increase in load is α = (G − 1)/(N − 1) < 1, but thi...

Data allocation in a Heterogeneous Disk Array (HDA) with multiple RAID levels for database applications

We consider the allocation of Virtual Arrays (VAs) in a Heterogeneous Disk Array (HDA). Each VA holds groups of related objects and datasets, such as files and relational tables, which have similar performance and availability characteristics. We evaluate single-pass data allocation methods for HDA using a synthetic stream of allocation requests, where each VA is characterized by its RAID level, dis...

Clustered RAID Arrays and Their Access Costs

RAID5 (resp. RAID6) are two popular RAID designs, which can tolerate one (resp. two) disk failures, but the load of surviving disks doubles (resp. triples) when failures occur. Clustered RAID5 (resp. RAID6) disk arrays utilize a parity group size G, which is smaller than the number of disks N, so that the redundancy level is 1/G (resp. 2/G). This enables the array to sustain a peak throughput ...

Journal title:
  • Comput. J.

Volume 50  Issue 

Pages  -

Publication date 2007